 physical skill


GROVE: A Generalized Reward for Learning Open-Vocabulary Physical Skill

arXiv.org Artificial Intelligence

Learning open-vocabulary physical skills for simulated agents presents a significant challenge in artificial intelligence. Current reinforcement learning approaches face critical limitations: manually designed rewards lack scalability across diverse tasks, while demonstration-based methods struggle to generalize beyond their training distribution. We introduce GROVE, a generalized reward framework that enables open-vocabulary physical skill learning without manual engineering or task-specific demonstrations. Our key insight is that Large Language Models (LLMs) and Vision Language Models (VLMs) provide complementary guidance: LLMs generate precise physical constraints capturing task requirements, while VLMs evaluate motion semantics and naturalness. Through an iterative design process, VLM-based feedback continuously refines LLM-generated constraints, creating a self-improving reward system. To bridge the domain gap between simulation and natural images, we develop Pose2CLIP, a lightweight mapper that efficiently projects agent poses directly into semantic feature space without computationally expensive rendering. Extensive experiments across diverse embodiments and learning paradigms demonstrate GROVE's effectiveness, achieving 22.2% higher motion naturalness and 25.7% better task completion scores while training 8.4x faster than previous methods. These results establish a new foundation for scalable physical skill acquisition in simulated environments.
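As a rough illustration of how constraint-based and semantic guidance could be combined into one reward, the sketch below mixes a constraint-satisfaction score with a cosine similarity between a pose embedding and a text embedding. The abstract does not specify how GROVE combines the two signals; the weighted sum, the function names, and the `semantic_weight` parameter are all assumptions made for this example, not the authors' formulation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def constraint_reward(state, constraints):
    """Fraction of physical constraints the state satisfies.

    Each constraint is a predicate over the state; in the setting the
    abstract describes, such predicates would be proposed by an LLM.
    """
    satisfied = sum(1.0 for check in constraints if check(state))
    return satisfied / len(constraints)


def combined_reward(state, pose_embedding, text_embedding, constraints,
                    semantic_weight=0.5):
    """Hypothetical weighted mix of semantic and physical terms (assumed)."""
    semantic = cosine_similarity(pose_embedding, text_embedding)
    physical = constraint_reward(state, constraints)
    return semantic_weight * semantic + (1.0 - semantic_weight) * physical


# Toy usage: one constraint on an illustrative "root_height" field.
constraints = [lambda s: s["root_height"] > 0.8]
good = combined_reward({"root_height": 1.0}, [1.0, 0.0], [1.0, 0.0], constraints)
bad = combined_reward({"root_height": 0.2}, [1.0, 0.0], [0.0, 1.0], constraints)
```

In this toy setup the pose embedding stands in for the output of a pose-to-feature mapper such as the Pose2CLIP module the abstract mentions; here it is just a hand-written vector.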


PhyPlan: Compositional and Adaptive Physical Task Reasoning with Physics-Informed Skill Networks for Robot Manipulators

arXiv.org Artificial Intelligence

Given the task of positioning a ball-like object to a goal region beyond direct reach, humans can often throw, slide, or rebound objects against the wall to attain the goal. However, enabling robots to reason similarly is non-trivial. Existing methods for physical reasoning are data-hungry and struggle with the complexity and uncertainty inherent in the real world. This paper presents PhyPlan, a novel physics-informed planning framework that combines physics-informed neural networks (PINNs) with modified Monte Carlo Tree Search (MCTS) to enable embodied agents to perform dynamic physical tasks. PhyPlan leverages PINNs to simulate and predict outcomes of actions in a fast and accurate manner and uses MCTS for planning. It dynamically determines whether to consult a PINN-based simulator (coarse but fast) or engage directly with the actual environment (fine but slow) to determine the optimal policy. Evaluation with robots in simulated 3D environments demonstrates the ability of our approach to solve 3D physical reasoning tasks involving the composition of dynamic skills. Quantitatively, PhyPlan excels in several aspects: (i) it achieves lower regret when learning novel tasks compared to the state of the art, (ii) it expedites skill learning and enhances the speed of physical reasoning, and (iii) it demonstrates higher data efficiency compared to a physics-uninformed approach.


Learning Reward for Physical Skills using Large Language Model

arXiv.org Artificial Intelligence

Learning reward functions for physical skills is challenging due to the vast spectrum of skills, the high dimensionality of the state and action spaces, and nuanced sensory feedback. The complexity of these tasks makes acquiring expert demonstration data both costly and time-consuming. Large Language Models (LLMs) contain valuable task-related knowledge that can aid in learning these reward functions. However, directly applying LLMs to propose reward functions has limitations, such as numerical instability and an inability to incorporate environment feedback. We aim to extract task knowledge from LLMs using environment feedback to create efficient reward functions for physical skills. Our approach consists of two components. We first use the LLM to propose features and a parameterization of the reward function. Next, we update the parameters of this proposed reward function through an iterative self-alignment process. In particular, this process minimizes the ranking inconsistency between the LLM and our learned reward functions based on new observations. We validated our method by testing it on three simulated physical skill learning tasks, demonstrating effective support for our design choices.
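The self-alignment step described above can be sketched as follows, under stated assumptions: the reward is taken to be linear in a small feature vector, the LLM's judgment is represented as a fixed rank per trajectory (0 is best), and the parameters are updated with a pairwise logistic (Bradley-Terry style) loss. The abstract does not give the paper's actual parameterization or update rule, so every name and choice here is illustrative only.

```python
import math
from itertools import combinations


def reward(weights, features):
    """Linear reward: dot product of parameters and trajectory features."""
    return sum(w * f for w, f in zip(weights, features))


def ranking_inconsistency(weights, trajectories, llm_rank):
    """Count trajectory pairs that the LLM and the reward order differently.

    llm_rank[i] is the LLM's rank for trajectory i (lower is better);
    ranks are assumed to be distinct.
    """
    discordant = 0
    for i, j in combinations(range(len(trajectories)), 2):
        llm_prefers_i = llm_rank[i] < llm_rank[j]
        reward_prefers_i = (reward(weights, trajectories[i])
                            > reward(weights, trajectories[j]))
        if llm_prefers_i != reward_prefers_i:
            discordant += 1
    return discordant


def align(weights, trajectories, llm_rank, lr=0.5, steps=200):
    """Gradient descent on a pairwise logistic loss over LLM-ranked pairs."""
    w = list(weights)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for i, j in combinations(range(len(trajectories)), 2):
            better, worse = (i, j) if llm_rank[i] < llm_rank[j] else (j, i)
            diff = [a - b for a, b in
                    zip(trajectories[better], trajectories[worse])]
            margin = reward(w, diff)
            # d/dw of -log(sigmoid(margin)) is -(1 - sigmoid(margin)) * diff
            coeff = 1.0 / (1.0 + math.exp(margin))
            for k in range(len(w)):
                grad[k] -= coeff * diff[k]
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w


# Toy data: three trajectories with two features each; the LLM ranks the
# third trajectory best, while the initial weights prefer the first.
trajectories = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
llm_rank = [2, 1, 0]
initial = [1.0, 0.0]
before = ranking_inconsistency(initial, trajectories, llm_rank)
aligned = align(initial, trajectories, llm_rank)
after = ranking_inconsistency(aligned, trajectories, llm_rank)
```

The pairwise logistic loss is used here because it is a smooth surrogate for the discordant-pair count, which is not differentiable; other surrogates would fit the same description.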


Dexterity holding back rise of the robots ... for now

#artificialintelligence

Woodside Petroleum's chief technology officer believes robots with comparable human dexterity skills are still five to 10 years away. Shaun Gregory told the Resources Technology Showcase that the limited dexterity offered by current robotics technologies suggested there was little immediate danger of humans losing physical skills. "We may lose those physical skills (but) when you look at the robots we are using, certainly not in the very near term," Mr Gregory said. While existing robots had a clasp that enabled them to perform some simple tasks, it was "nowhere near as dexterous as a human hand". Woodside has embraced robotics, advanced sensor design and deployment, artificial intelligence, data science and visualisation, even collaborating with NASA, as a means of reducing costs and improving the efficiency and safety of its oil and gas operations.


Human-machine collaboration and the future of work

#artificialintelligence

We naturally think of "intelligence" as a trait belonging to individuals. We're all--students, employees, soldiers, artists, athletes--regularly evaluated in terms of personal accomplishment, with "lone hero" narratives prevailing in accounts of scientific discovery, politics, and business. Similarly, artificial intelligence is typically defined as a quest to build individual machines that possess different forms of intelligence, even the kind of general intelligence measured in humans for more than a century. Yet focusing on individual intelligence, whether human or machine, can distract us from the true nature of accomplishment. As Thomas Malone, professor at MIT's Sloan School of Management and director of its Center for Collective Intelligence, notes: "Almost everything we humans have ever done has been done not by lone individuals, but by groups of people working together, often across time and space." Malone, the author of 2004's The Future of Work and a pioneering researcher in the field of collective intelligence, is in a singular position to understand the potential of AI technologies to transform workers, workplaces, and societies. In this conversation with Deloitte's Jim Guszcza and Jeff Schwartz, he discusses a vision outlined in his recent book Superminds--a framework for achieving new forms of human-machine collective intelligence and its implications for the future of work. Can you tell us what a "supermind" is, and how you define collective intelligence? Thomas Malone, director, MIT Center for Collective Intelligence: A "supermind" is a group of individuals acting collectively in ways that seem intelligent, and collective intelligence essentially has the same definition. For many years, I defined collective intelligence as groups of individuals acting collectively in ways that seem intelligent. But I think it's probably more useful to think of collective intelligence as the property that a supermind has.


7 Jobs Humans Can Do Better Than Robots And AI

#artificialintelligence

The rise of artificial intelligence (AI) in recent decades has caused collective anxiety around the world. The common apprehension is that there will be a massive loss of jobs as robots and computers eventually replace employees. The fear is not without some basis; after all, robots and computers have proven to be far better than humans at executing certain tasks. However, not all jobs will eventually be replaced by AI. The answer lies in knowing both what AI can do better than humans and what humans can do better than AI. It is no secret that many jobs have been taken over by AI.


Competitive Video Gaming Could Be the Newest Olympic Sport. Here's What It Would Look Like

TIME - Tech

The space could pass for a TV control room at any major sporting event. A few dozen workers were wearing headsets, looking at a maze of computer screens. But here, footsteps away from the frigid beach in the coastal South Korean city of Gangneung--which is hosting figure skating, hockey and other arena ice events at the 2018 Winter Olympic Games in PyeongChang--NBC wasn't covering curling. Instead ESL, a company that organizes competitions in e-sports, or competitive video gaming, was streaming the semi-finals of a tournament hosted by Olympic sponsor Intel on the Olympic Channel. The stakes were particularly high, and not just for the competitors.


Personalized Instruction of Physical Skills with a Social Robot

AAAI Conferences

While robots have been used extensively to teach symbolic knowledge, using robots to teach or refine human motor skills, such as swinging a bat or shooting a basketball, is underexplored. Robots are uniquely well situated to observe physical movements, identify problems, prioritize which problems to address first, and patiently communicate personalized advice to the student. We propose an architecture to coach physical skills, and focus on the second and third of these challenges (identifying problems with the movements, and prioritizing which to address first) as applied to the domain of shooting a basketball. We present a supervised learning approach to prioritize which problems to work on, and propose the design of several user studies that will determine the effectiveness of the algorithm.